Personality Mining from Biographical Data with the "Adjectival Marker" Technique
Authors
Abstract
The last decade has seen significant work on personality mining from lexical cues in social media data. Most of this work relies on dictionary-based approaches such as LIWC, which focus primarily on function words, and little has yet been done to extract such cues from the biographical data that populates social media. In this paper we introduce a novel method of personality mining from social media data, the “Adjectival Marker Technique”. The method extracts lexical features from descriptive texts (e.g. biographical data) to train a learning model that predicts the personality traits of the subject. Conceptually, it draws heavily on the last 78 years of work in lexical psychology and on the Big Five personality test. It is not merely a computational variant of the early theories of lexical psychology, however: it also predicts personality with substantial accuracy, matching the results of psychometric tests. We propose a variant of the Lexical Hypothesis from psychology, and this modified hypothesis is validated by the computational results of personality prediction achieved by the Adjectival Marker Technique. The paper also discusses insights into the coherence of people's judgments about the subject's personality (virtual personality). The average prediction accuracy (i.e. agreement with psychometric tests for the Big Five) was approximately 82.82% for Extraversion, 89.62% for Agreeableness, 92.48% for Conscientiousness and 81.67% for Imaginativeness/Intellect.
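The feature-extraction step described in the abstract can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the `MARKERS` lexicon, the function names and the example biography are hypothetical, and the signed count used for scoring is a simple stand-in for the trained learning model, whose details the abstract does not give.

```python
import re
from collections import Counter

# Hypothetical marker lexicon: adjective -> (trait, polarity).
# Illustrative only; this is not the lexicon used in the paper.
MARKERS = {
    "outgoing": ("extraversion", +1),
    "talkative": ("extraversion", +1),
    "shy": ("extraversion", -1),
    "kind": ("agreeableness", +1),
    "generous": ("agreeableness", +1),
    "rude": ("agreeableness", -1),
    "organized": ("conscientiousness", +1),
    "diligent": ("conscientiousness", +1),
    "careless": ("conscientiousness", -1),
    "creative": ("intellect", +1),
    "curious": ("intellect", +1),
    "unimaginative": ("intellect", -1),
}


def marker_counts(text):
    """Count occurrences of known adjectival markers in a descriptive text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t in MARKERS)


def trait_scores(text):
    """Sum signed marker counts per trait (a stand-in for a trained model)."""
    scores = {trait: 0 for trait, _ in MARKERS.values()}
    for adjective, n in marker_counts(text).items():
        trait, polarity = MARKERS[adjective]
        scores[trait] += polarity * n
    return scores


# Hypothetical biography text, invented for this example.
bio = ("A kind, generous and outgoing teacher, known as diligent "
       "and curious, if sometimes shy.")
print(trait_scores(bio))
```

In a fuller pipeline the per-text marker counts would serve as the feature vector for a supervised classifier trained against psychometric scores; the signed sum above only shows the shape of the features.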
Similar resources
Introducing an algorithm for use to hide sensitive association rules through perturb technique
Due to the rapid growth of data mining technology, obtaining private data on users through this technology becomes easier. Association Rules Mining is one of the data mining techniques to extract useful patterns in the form of association rules. One of the main problems in applying this technique on databases is the disclosure of sensitive data by endangering security and privacy. Hiding the as...
Tolstoy Digital: Mining Biographical Data in Literary Heritage Editions
This paper presents a solution for mining the biographical information from commentaries on Leo Tolstoy’s letters. It is implemented as a part of Tolstoy Digital Project – a semantically marked-up web publication of the 90-volume complete collection of Leo Tolstoy’s works. Extraction of relevant biographical information will be used to create an open database for all the persons who were someho...
Explanation of Relationships between Biographical Characteristics and Entrepreneurship Spirit of Students
Three major reasons for the importance of entrepreneurship are creating wealth, developing technology and creating productive employment. It is generally believed that a revolution is needed for entrepreneurship to take place in societies nowadays. Thus, the present study aims to investigate biographical characteristics and explain their relation to entrepreneurial spirit at Mazandara...
Toward Algorithmic Discovery of Biographical Information in Local Gazetteers of Ancient China
Difangzhi (地方志) is a large collection of local gazetteers compiled by local governments of China, and the documents provide invaluable information about the host locality. This paper reports the current status of using natural language processing and text mining methods to identify biographical information of government officers so that we can add the information into the China Biographical Dat...
Using Bacillus Cereus as a Geo-Biological Marker For Gold Prospecting in Iran
Several methods have been developed for gold exploration in the past, among which the biologically based method is known to be the most efficient and the least expensive. This method can also be used to explore latent gold prospects. In the present study, the possibility of applying Bacillus cereus frequency in soil as a biological marker was investigated for the exploration of latent gold prospectin...